
54 research outputs found

Coarse-Graining Auto-Encoders for Molecular Dynamics

Author: Gómez-Bombarelli Rafael
Wang Wujie
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/03/2019

Molecular dynamics simulations provide theoretical insight into the microscopic behavior of materials in condensed phase and, as a predictive tool, enable computational design of new compounds. However, because of the large temporal and spatial scales involved in thermodynamic and kinetic phenomena in materials, atomistic simulations are often computationally infeasible. Coarse-graining methods make it possible to simulate larger systems, by reducing the dimensionality of the simulation, and to take longer timesteps, by averaging out fast motions. Coarse-graining involves two coupled learning problems: defining the mapping from an all-atom to a reduced representation, and parametrizing a Hamiltonian over coarse-grained coordinates. Multiple statistical mechanics approaches have addressed the latter, but the former is generally a hand-tuned process based on chemical intuition. Here we present Autograin, an optimization framework based on auto-encoders to learn both tasks simultaneously. Autograin is trained to learn the optimal mapping between all-atom and reduced representation, using the reconstruction loss to facilitate the learning of coarse-grained variables. In addition, a force-matching method is applied to variationally determine the coarse-grained potential energy function. This procedure is tested on a number of model systems including single-molecule and bulk-phase periodic simulations. Comment: 8 pages, 6 figures
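As a rough illustration of the two ingredients named in the abstract above (a mapping from all-atom to coarse-grained coordinates, and a force-matching objective), the sketch below uses a fixed, hand-specified center-of-mass mapping rather than the learned mapping of Autograin. The function names, the `assignment` list, and the plain mean-squared-error loss are assumptions made for illustration, not the authors' implementation.

```python
def coarse_grain(positions, masses, assignment, n_beads):
    """Map atoms to beads: each bead sits at the center of mass of its atoms.

    assignment[i] is the (hypothetical) bead index of atom i; every bead
    must own at least one atom, otherwise its mass would be zero.
    """
    dim = len(positions[0])
    bead_mass = [0.0] * n_beads
    bead_pos = [[0.0] * dim for _ in range(n_beads)]
    for x, m, b in zip(positions, masses, assignment):
        bead_mass[b] += m
        for k in range(dim):
            bead_pos[b][k] += m * x[k]
    return [[c / bead_mass[b] for c in bead_pos[b]] for b in range(n_beads)]


def force_matching_loss(predicted, reference):
    """Mean squared difference between predicted and reference bead forces."""
    total, count = 0.0, 0
    for fp, fr in zip(predicted, reference):
        for a, b in zip(fp, fr):
            total += (a - b) ** 2
            count += 1
    return total / count
```

In a learned scheme the assignment would be replaced by trainable weights and the loss minimized by gradient descent; here both are kept static so the bookkeeping is easy to follow.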

arXiv.org e-Print Archive

DSpace@MIT

Simulations with machine learning potentials identify the ion conduction mechanism mediating non-Arrhenius behavior in LGPS

Author: Gómez-Bombarelli Rafael
Winter Gavin
Publication date: 27/11/2022

Li$_{10}$Ge(PS$_6$)$_2$ (LGPS) is a highly concentrated solid electrolyte, in which Coulombic repulsion between neighboring cations is hypothesized as the underlying reason for concerted ion hopping, a mechanism common among superionic conductors such as Li$_7$La$_3$Zr$_2$O$_{12}$ (LLZO) and Li$_{1.3}$Al$_{0.3}$Ti$_{1.7}$(PO$_4$)$_3$ (LATP). While first principles simulations using molecular dynamics (MD) provide insight into the Li$^+$ transport mechanism, historically, there has been a gap in the temperature ranges studied in simulations and experiments. Here, we used a neural network (NN) potential trained on density functional theory (DFT) simulations, to run up to 40-nanosecond long MD simulations at DFT-like accuracy to characterize the ion conduction mechanisms across a range of temperatures that includes previous simulations and experimental studies. We have confirmed a Li$^+$ sublattice phase transition in LGPS around 400 K, below which the \textit{ab}-plane diffusivity $D^*_{ab}$ is drastically reduced. Concomitant with the sublattice phase transition near 400 K, there is less cation-cation (cross) correlation, as characterized by Haven ratios closer to 1, and the vibrations in the system are more harmonic at lower temperature. Intuitively, at high temperature, the collection of vibrational modes may be sufficient to drive concerted ion hops. However, near room temperature, the vibrational modes available may be insufficient to overcome electrostatic repulsion, thus resulting in less correlated ion motion and comparatively slower ion conduction. Such phenomena of a sublattice phase transition, below which concerted hopping plays a less significant role, may be extended to other highly concentrated solid electrolytes such as LLZO and LATP.
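For readers unfamiliar with the quantities in this abstract, the standard textbook definitions are sketched below; the abstract itself does not spell them out, so the notation ($d$ for dimensionality, $N$ for the number of mobile ions, $D_\sigma$ for the charge diffusivity, $E_a$ for the activation energy) is an assumption introduced here.

```latex
% Tracer diffusivity from the mean squared displacement of individual ions
D^* = \lim_{t \to \infty} \frac{1}{2 d N t}
      \sum_{i=1}^{N} \left\langle \left| \mathbf{r}_i(t) - \mathbf{r}_i(0) \right|^2 \right\rangle

% Charge (conductivity) diffusivity from the collective displacement of all ions
D_\sigma = \lim_{t \to \infty} \frac{1}{2 d N t}
      \left\langle \left| \sum_{i=1}^{N} \left( \mathbf{r}_i(t) - \mathbf{r}_i(0) \right) \right|^2 \right\rangle

% Haven ratio: equals 1 when cross-correlations between different ions vanish
H_R = \frac{D^*}{D_\sigma}

% Arrhenius behavior: a single activation energy across the temperature range
D(T) = D_0 \exp\!\left( -\frac{E_a}{k_B T} \right)
```

Under these definitions, non-Arrhenius behavior means that $\ln D$ plotted against $1/T$ is not a single straight line, which is consistent with the change in mechanism near 400 K described above.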

arXiv.org e-Print Archive

Directory of Open Access Journals

Learning Pair Potentials using Differentiable Simulations

Author: Gómez-Bombarelli Rafael
Wang Wujie
Wu Zhenghao
Publication date: 15/09/2022

Learning pair interactions from experimental or simulation data is of great interest for molecular simulations. We propose a general stochastic method for learning pair interactions from data using differentiable simulations (DiffSim). DiffSim defines a loss function based on structural observables, such as the radial distribution function, through molecular dynamics (MD) simulations. The interaction potentials are then learned directly by stochastic gradient descent, using backpropagation to calculate the gradient of the structural loss metric with respect to the interaction potential through the MD simulation. This gradient-based method is flexible and can be configured to simulate and optimize multiple systems simultaneously. For example, it is possible to simultaneously learn potentials for different temperatures or for different compositions. We demonstrate the approach by recovering simple pair potentials, such as Lennard-Jones systems, from radial distribution functions. We find that DiffSim can be used to probe a wider functional space of pair potentials compared to traditional methods like Iterative Boltzmann Inversion. We show that our methods can be used to simultaneously fit potentials for simulations at different compositions and temperatures to improve the transferability of the learned potentials. Comment: 12 pages, 10 figures
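For context, the traditional baseline named in the abstract, Iterative Boltzmann Inversion (IBI), can be sketched in a few lines. This is not DiffSim itself: it shows the Lennard-Jones form used as a recovery target and the standard IBI update $U_{n+1}(r) = U_n(r) + k_B T \ln[g_n(r)/g_{\text{target}}(r)]$ on a tabulated potential. Function names, reduced units, and the handling of empty bins are assumptions for illustration.

```python
import math


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy at separation r (reduced units assumed)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def ibi_update(potential, g_current, g_target, kT=1.0):
    """One IBI step on a tabulated potential.

    Where the simulated RDF exceeds the target, the potential is raised
    (made more repulsive); where it falls short, the potential is lowered.
    Bins where either RDF is zero are left unchanged (an assumption).
    """
    updated = []
    for u, gc, gt in zip(potential, g_current, g_target):
        if gc > 0.0 and gt > 0.0:
            updated.append(u + kT * math.log(gc / gt))
        else:
            updated.append(u)
    return updated
```

IBI corrects each tabulated point independently from the RDF mismatch at that distance, whereas the abstract describes DiffSim as obtaining gradients of a structural loss by backpropagating through the MD simulation.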

arXiv.org e-Print Archive

Chemistry-informed Macromolecule Graph Representation for Similarity Computation and Supervised Learning

Author: An Joyce
Gómez-Bombarelli Rafael
Mohapatra Somesh
Publication date: 03/03/2021

Macromolecules are large, complex molecules composed of covalently bonded monomer units, existing in different stereochemical configurations and topologies. As a result of such chemical diversity, representing, comparing, and learning over macromolecules emerge as critical challenges. To address this, we developed a macromolecule graph representation, with monomers and bonds as nodes and edges, respectively. We captured the inherent chemistry of the macromolecule by using molecular fingerprints for node and edge attributes. For the first time, we demonstrated computation of chemical similarity between two macromolecules of varying chemistry and topology, using exact graph edit distances and graph kernels. We also trained graph neural networks for a variety of glycan classification tasks, achieving state-of-the-art results. Our work has two-fold implications: it provides a general framework for representation, comparison, and learning of macromolecules, and it enables quantitative chemistry-informed decision-making and iterative design in the macromolecular chemical space. Comment: Main text: 4 pages, 2 figures, 1 table; Appendix: 18 pages, 25 figures, 3 tables

arXiv.org e-Print Archive

DSpace@MIT

Differentiable sampling of molecular geometries with uncertainty-based adversarial attacks

Author: Gómez-Bombarelli Rafael
Schwalbe-Koda Daniel
Tan Aik Rui
Publication date: 28/03/2021

Neural network (NN) interatomic potentials provide fast prediction of potential energy surfaces, closely matching the accuracy of the electronic structure methods used to produce the training data. However, NN predictions are only reliable within well-learned training domains, and show volatile behavior when extrapolating. Uncertainty quantification approaches can flag atomic configurations for which prediction confidence is low, but arriving at such uncertain regions requires expensive sampling of the NN phase space, often using atomistic simulations. Here, we exploit automatic differentiation to drive atomistic systems towards high-likelihood, high-uncertainty configurations without the need for molecular dynamics simulations. By performing adversarial attacks on an uncertainty metric, informative geometries that expand the training domain of NNs are sampled. When combined with an active learning loop, this approach bootstraps and improves NN potentials while decreasing the number of calls to the ground truth method. This efficiency is demonstrated on sampling of kinetic barriers and collective variables in molecules, and can be extended to any NN potential architecture and materials system. Comment: 12 pages, 4 figures, supporting information
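The core idea of climbing an uncertainty metric by gradient ascent can be shown on a toy problem. The sketch below is a deliberately simplified stand-in: it uses a one-dimensional coordinate, an ensemble of arbitrary callables in place of NN potentials, the ensemble variance as the uncertainty metric, and a central finite difference in place of automatic differentiation. It also omits the likelihood weighting that the abstract describes as keeping configurations physically plausible. All names are hypothetical.

```python
def ensemble_variance(x, models):
    """Variance of the ensemble predictions at x, used as the uncertainty metric."""
    preds = [m(x) for m in models]
    mean = sum(preds) / len(preds)
    return sum((p - mean) ** 2 for p in preds) / len(preds)


def ascend_uncertainty(x0, models, step=0.1, n_steps=20, h=1e-5):
    """Move x uphill on the ensemble variance with plain gradient ascent.

    The gradient is a central finite difference (a stand-in for autodiff).
    """
    x = x0
    for _ in range(n_steps):
        grad = (ensemble_variance(x + h, models)
                - ensemble_variance(x - h, models)) / (2.0 * h)
        x += step * grad
    return x


# Toy ensemble: two quadratic "potentials" that disagree more far from x = 0.
toy_models = [lambda x: 0.9 * x * x, lambda x: 1.1 * x * x]
x_adv = ascend_uncertainty(1.0, toy_models)
```

In an active learning loop, a point such as `x_adv` would be labeled with the ground truth method and added to the training set before retraining the ensemble.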

arXiv.org e-Print Archive

DSpace@MIT

Directory of Open Access Journals

PubMed Central

Photocell optimization using dark state protection

Author: Fruchtman Amir
Gauger Erik M.
Gómez-Bombarelli Rafael
Lovett Brendon W.
Publication venue: 'American Physical Society (APS)'
Publication date: 05/08/2016

This work was supported by the Leverhulme Trust (RPG-080). EMG is supported by the Royal Society of Edinburgh/Scottish Government. RGB thanks Samsung Advanced Institute of Technology for funding. AF thanks the Anglo-Israeli association and the Anglo-Jewish association for funding. Conventional photocells suffer a fundamental efficiency threshold imposed by the principle of detailed balance, reflecting the fact that good absorbers must necessarily also be fast emitters. This limitation can be overcome by "parking" the energy of an absorbed photon in a dark state which neither absorbs nor emits light. Here we argue that suitable dark states occur naturally as a consequence of the dipole-dipole interaction between two proximal optical dipoles for a wide range of realistic molecular dimers. We develop an intuitive model of a photocell comprising two light-absorbing molecules coupled to an idealized reaction centre, showing asymmetric dimers are capable of providing a significant enhancement of light-to-current conversion under ambient conditions. We conclude by describing a road map for identifying suitable molecular dimers for demonstrating this effect by screening a very large set of possible candidate molecules. Postprint. Peer reviewed.

arXiv.org e-Print Archive

Heriot Watt Pure

University of St. Andrews - Pure

St Andrews Research Repository

Automated patent extraction powers generative modeling in focused chemical spaces

Author: Gervaix Alexis
Greenman Kevin
Gómez-Bombarelli Rafael
Subramanian Akshay
Yang Tzuhsiung
Publication date: 02/06/2023

Deep generative models have emerged as an exciting avenue for inverse molecular design, with progress coming from the interplay between training algorithms and molecular representations. One of the key challenges in their applicability to materials science and chemistry has been the lack of access to sizeable training datasets with property labels. Published patents contain the first disclosure of new materials prior to their publication in journals, and are a vast source of scientific knowledge that has remained relatively untapped in the field of data-driven molecular design. Because patents are filed seeking to protect specific uses, molecules in patents can be considered to be weakly labeled into application classes. Furthermore, patents published by the US Patent and Trademark Office (USPTO) are downloadable and have machine-readable text and molecular structures. In this work, we train domain-specific generative models using patent data sources by developing an automated pipeline to go from USPTO patent digital files to the generation of novel candidates with minimal human intervention. We test the approach on two in-class extracted datasets, one in organic electronics and another in tyrosine kinase inhibitors. We then evaluate the ability of generative models trained on these in-class datasets on two categories of tasks (distribution learning and property optimization), identify strengths and limitations, and suggest possible explanations and remedies that could be used to overcome these in practice.

arXiv.org e-Print Archive

From free-energy profiles to activation free energies

Author: Diestler Dennis J.
Dietschreit Johannes C.
Gómez-Bombarelli Rafael
Hulm Andreas
Ochsenfeld Christian
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 24/08/2022

Given a chemical reaction going from reactant (R) to the product (P) on a potential energy surface (PES) and a collective variable (CV) discriminating between R and P, we define the free-energy profile (FEP) as the logarithm of the marginal Boltzmann distribution of the CV. This FEP is not a true free energy. Nevertheless, it is common to treat the FEP as the “free-energy” analog of the minimum potential energy path and to take the activation free energy, $\Delta F^{\ddagger}_{\mathrm{RP}}$, as the difference between the maximum at the transition state and the minimum at R. We show that this approximation can result in large errors. The FEP depends on the CV and is, therefore, not unique. For the same reaction, different discriminating CVs can yield different $\Delta F^{\ddagger}_{\mathrm{RP}}$. We derive an exact expression for the activation free energy that avoids this ambiguity. We find $\Delta F^{\ddagger}_{\mathrm{RP}}$ to be a combination of the probability of the system being in the reactant state, the probability density on the dividing surface, and the thermal de Broglie wavelength associated with the transition. We apply our formalism to simple analytic models and realistic chemical systems and show that the FEP-based approximation applies only at low temperatures for CVs with a small effective mass. Most chemical reactions occur on complex, high-dimensional PES that cannot be treated analytically and pose the added challenge of choosing a good CV. We study the influence of that choice and find that, while the reaction free energy is largely unaffected, $\Delta F^{\ddagger}_{\mathrm{RP}}$ is quite sensitive.
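The FEP described above can be estimated from sampled CV values with a histogram. The sketch below assumes the conventional sign and prefactor, $F(s) = -k_B T \ln p(s)$, where $p(s)$ is the marginal probability density of the CV. The function name, binning scheme, and treatment of empty bins are illustrative choices, and this is only the FEP, not the exact activation free energy derived in the paper.

```python
import math


def free_energy_profile(samples, lo, hi, n_bins, kT=1.0):
    """Histogram CV samples on [lo, hi) and return (bin_centers, F).

    F = -kT * ln p(s), with p(s) the normalized histogram density.
    Empty bins get F = +inf (zero estimated probability).
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            # min() guards against floating-point rounding at the upper edge
            counts[min(int((s - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    centers, profile = [], []
    for i, c in enumerate(counts):
        centers.append(lo + (i + 0.5) * width)
        if c == 0:
            profile.append(math.inf)
        else:
            profile.append(-kT * math.log(c / (total * width)))
    return centers, profile
```

Because $p(s)$ is a density in the CV, the profile changes when the CV is redefined, which is consistent with the abstract's point that a barrier read directly from the FEP is not unique.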

DigitalCommons@University of Nebraska